<!DOCTYPE html>
<html lang="en">
<head>
	<meta charset="UTF-8">
	<meta http-equiv="X-UA-Compatible" content="IE=edge">
    <meta name="viewport" content="width=device-width, initial-scale=1">
	<title>短语匹配 | Elasticsearch: 权威指南 | Elastic</title>
    <!-- Give IE8 a fighting chance -->
    <!--[if lt IE 9]>
    <script src="https://oss.maxcdn.com/html5shiv/3.7.2/html5shiv.min.js"></script>
    <script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
    <![endif]-->
	<link rel="stylesheet" type="text/css" href="../static/styles.css" />
</head>
<body>
<div class="main-container">
    <section id="content">
        
        <div class="content-wrapper">
            <section id="guide" lang="zh_cn">
                <div class="container">
                    <div class="row">
                        <div class="col-xs-12 col-sm-8 col-md-8 guide-section">
                            <div style="color:gray; word-break: break-all; font-size:12px;">原文地址: <a href="https://www.elastic.co/guide/cn/elasticsearch/guide/current/phrase-matching.html" rel="nofollow">https://www.elastic.co/guide/cn/elasticsearch/guide/current/phrase-matching.html</a>, 版权归 www.elastic.co 所有<br/>
                            英文版地址: <a href="https://www.elastic.co/guide/en/elasticsearch/guide/current/phrase-matching.html" rel="nofollow">https://www.elastic.co/guide/en/elasticsearch/guide/current/phrase-matching.html</a>
                            </div>
                        <!-- start body -->
                  <div class="page_header">
<b>请注意:</b><br>本书基于 Elasticsearch 2.x 版本，有些内容可能已经过时。
</div>
<div id="content">
<div class="breadcrumbs">
<span class="breadcrumb-link"><a href="index.html">Elasticsearch: 权威指南</a></span>
»
<span class="breadcrumb-link"><a href="search-in-depth.html">深入搜索</a></span>
»
<span class="breadcrumb-link"><a href="proximity-matching.html">近似匹配</a></span>
»
<span class="breadcrumb-node">短语匹配</span>
</div>
<div class="navheader">
<span class="prev">
<a href="proximity-matching.html">« 近似匹配</a>
</span>
<span class="next">
<a href="slop.html">混合起来 »</a>
</span>
</div>
<div class="section">
<div class="titlepage"><div><div>
<h2 class="title">
<a id="phrase-matching"></a>短语匹配<a class="edit_me edit_me_private" rel="nofollow" title="Editing on GitHub is available to Elastic" href="https://github.com/elasticsearch-cn/elasticsearch-definitive-guide/edit/cn/120_Proximity_Matching/05_Phrase_matching.asciidoc">edit</a>
</h2>
</div></div></div>
<p>就像 <code class="literal">match</code> 查询对于标准全文检索是一种最常用的查询一样，当你想找到彼此邻近搜索词的查询方法时，就会想到 <code class="literal">match_phrase</code> 查询。</p>
<div class="pre_wrapper lang-sense">
<pre class="programlisting prettyprint lang-sense">GET /my_index/my_type/_search
{
    "query": {
        "match_phrase": {
            "title": "quick brown fox"
        }
    }
}</pre>
</div>
<div class="sense_widget" data-snippet="snippets/120_Proximity_Matching/05_Match_phrase_query.json"></div>
<p>类似 <code class="literal">match</code> 查询， <code class="literal">match_phrase</code> 查询首先将查询字符串解析成一个词项列表，然后对这些词项进行搜索，但只保留那些包含 <em>全部</em> 搜索词项，且 <em>位置</em> 与搜索词项相同的文档。
比如对于 <code class="literal">quick fox</code> 的短语搜索可能不会匹配到任何文档，因为没有文档包含的 <code class="literal">quick</code> 词之后紧跟着 <code class="literal">fox</code> 。</p>
<div class="tip admon">
<div class="icon"></div>
<div class="admon_content">
<p><code class="literal">match_phrase</code> 查询同样可写成一种类型为 <code class="literal">phrase</code> 的 <code class="literal">match</code> 查询:</p>
<div class="pre_wrapper lang-sense">
<pre class="programlisting prettyprint lang-sense">"match": {
    "title": {
        "query": "quick brown fox",
        "type":  "phrase"
    }
}</pre>
</div>
<div class="sense_widget" data-snippet="snippets/120_Proximity_Matching/05_Match_phrase_query.json"></div>
</div>
</div>
<div class="section">
<div class="titlepage"><div><div>
<h3 class="title">
<a id="_词项的位置"></a>词项的位置<a class="edit_me edit_me_private" rel="nofollow" title="Editing on GitHub is available to Elastic" href="https://github.com/elasticsearch-cn/elasticsearch-definitive-guide/edit/cn/120_Proximity_Matching/05_Phrase_matching.asciidoc">edit</a>
</h3>
</div></div></div>
<p>当一个字符串被分词后，这个分析器不但会返回一个词项列表，而且还会返回各词项在原始字符串中的 <em>位置</em> 或者顺序关系：</p>
<div class="pre_wrapper lang-sense">
<pre class="programlisting prettyprint lang-sense">GET /_analyze?analyzer=standard
Quick brown fox</pre>
</div>
<div class="sense_widget" data-snippet="snippets/120_Proximity_Matching/05_Term_positions.json"></div>
<p>返回信息如下：</p>
<div class="pre_wrapper lang-js pagebreak-before">
<pre class="programlisting prettyprint lang-js pagebreak-before">{
   "tokens": [
      {
         "token": "quick",
         "start_offset": 0,
         "end_offset": 5,
         "type": "&lt;ALPHANUM&gt;",
         "position": 1 <a id="CO81-1"></a><i class="conum" data-value="1"></i>
      },
      {
         "token": "brown",
         "start_offset": 6,
         "end_offset": 11,
         "type": "&lt;ALPHANUM&gt;",
         "position": 2 <a id="CO81-2"></a><i class="conum" data-value="1"></i>
      },
      {
         "token": "fox",
         "start_offset": 12,
         "end_offset": 15,
         "type": "&lt;ALPHANUM&gt;",
         "position": 3 <a id="CO81-3"></a><i class="conum" data-value="1"></i>
      }
   ]
}</pre>
</div>
<div class="calloutlist">
<table border="0" summary="Callout list">
<tr>
<td align="left" valign="top" width="5%">
<p><a href="#CO81-1"><i class="conum" data-value="1"></i></a><a href="#CO81-2"></a><a href="#CO81-3"></a></p>
</td>
<td align="left" valign="top">
<p><code class="literal">position</code> 代表各词项在原始字符串中的位置。</p>
</td>
</tr>
</table>
</div>
<p>位置信息可以被存储在倒排索引中，因此 <code class="literal">match_phrase</code> 查询这类对词语位置敏感的查询，
就可以利用位置信息去匹配包含所有查询词项，且各词项顺序也与我们搜索指定一致的文档，中间不夹杂其他词项。</p>
</div>

<div class="section">
<div class="titlepage"><div><div>
<h3 class="title">
<a id="_什么是短语"></a>什么是短语<a class="edit_me edit_me_private" rel="nofollow" title="Editing on GitHub is available to Elastic" href="https://github.com/elasticsearch-cn/elasticsearch-definitive-guide/edit/cn/120_Proximity_Matching/05_Phrase_matching.asciidoc">edit</a>
</h3>
</div></div></div>
<p>一个被认定为和短语 <code class="literal">quick brown fox</code> 匹配的文档，必须满足以下这些要求：</p>
<div class="ulist itemizedlist">
<ul class="itemizedlist">
<li class="listitem">
<code class="literal">quick</code> 、 <code class="literal">brown</code> 和 <code class="literal">fox</code> 需要全部出现在域中。
</li>
<li class="listitem">
<code class="literal">brown</code> 的位置应该比 <code class="literal">quick</code> 的位置大 <code class="literal">1</code> 。
</li>
<li class="listitem">
<code class="literal">fox</code> 的位置应该比 <code class="literal">quick</code> 的位置大 <code class="literal">2</code> 。
</li>
</ul>
</div>
<p>如果以上任何一个选项不成立，则该文档不能认定为匹配。</p>
<div class="tip admon">
<div class="icon"></div>
<div class="admon_content">
<p>本质上来讲，<code class="literal">match_phrase</code> 查询是利用一种低级别的 <code class="literal">span</code> 查询族（query family）去做词语位置敏感的匹配。

Span 查询是一种词项级别的查询，所以它们没有分词阶段；它们只对指定的词项进行精确搜索。</p>
<p>值得庆幸的是，<code class="literal">match_phrase</code> 查询已经足够优秀，大多数人是不会直接使用 <code class="literal">span</code> 查询。
然而，在一些专业领域，例如专利检索，还是会采用这种低级别查询去执行非常具体而又精心构造的位置搜索。</p>
</div>
</div>
</div>

</div>
<div class="navfooter">
<span class="prev">
<a href="proximity-matching.html">« 近似匹配</a>
</span>
<span class="next">
<a href="slop.html">混合起来 »</a>
</span>
</div>
</div>

                  <!-- end body -->
                        </div>
                        <div class="col-xs-12 col-sm-4 col-md-4" id="right_col">
                        
                        </div>
                    </div>
                </div>
            </section>
        </div>
    </section>
</div>
<script src="../static/cn.js"></script>
</body>
</html>